THE STELLAR INITIAL MASS FUNCTION IN EARLY-TYPE GALAXIES FROM ABSORPTION LINE
SPECTROSCOPY. I. DATA AND EMPIRICAL TRENDS 
ABSTRACT

The strength of gravity-sensitive absorption lines in the integrated light of old stellar populations is one of the 
few direct probes of the stellar initial mass function (IMF) outside of the Milky Way. Owing to the advent 
of fully depleted CCDs with little or no fringing it has recently become possible to obtain accurate mea-
surements of these features. Here we present spectra covering the wavelength ranges 0.35 μm - 0.55 μm and
0.72 μm - 1.03 μm for the bulge of M31 and 34 early-type galaxies from the SAURON sample, obtained with
the Low Resolution Imaging Spectrometer on Keck. The signal-to-noise ratio is > 200 Å⁻¹ out to 1 μm, which
is sufficient to measure gravity-sensitive features for individual galaxies and to determine how they depend 
on other properties of the galaxies. Combining the new data with previously obtained spectra for globular 
clusters in M31 and the most massive elliptical galaxies in the Virgo cluster we find that the dwarf-sensitive
Na I λ8183,8195 doublet and the FeH λ9916 Wing-Ford band increase systematically with velocity dispersion,
while the giant-sensitive Ca II λ8498,8542,8662 triplet decreases with dispersion. These trends are consistent
with a varying IMF, such that galaxies with deeper potential wells have more dwarf-enriched mass functions. 
In a companion paper (Conroy & van Dokkum 2012) we use a comprehensive stellar population synthesis 
model to demonstrate that IMF effects can be separated from age and abundance variations and quantify the 
IMF variation among early-type galaxies. 

Keywords: cosmology: observations — galaxies: evolution 



1. INTRODUCTION 

The form of the stellar initial mass function (IMF) is of 
fundamental importance for many areas of astrophysics and 
one of the largest uncertainties in the interpretation of the in- 
tegrated light of stellar populations. The IMF is reasonably 
well constrained in the disk of the Milky Way as stars can be 
counted more or less directly. For the past decade the con-
sensus has been that the Milky Way IMF is a powerlaw with
a logarithmic slope of ∼ 2.3 at M > 1 M⊙, with a gradual
turnover at lower masses (see, e.g., Kroupa 2001; Chabrier 
2003). This turnover can be interpreted as a characteristic 
mass: in the Milky Way disk, the formation of stars with 
masses of a few tenths of the mass of the Sun is apparently 
favored over the formation of lower and higher mass stars. 
This departure from a powerlaw is important, as most of the 
stellar mass- and number density is in the form of low mass 
stars. As a result, apparently subtle changes in the form of 
the low mass IMF significantly alter the mass-to-light (M/L) 
ratios of galaxies. As an example, for the same total lumi- 
nosity, a Salpeter (1955) IMF with a constant slope of 2.3 
down to M = 0.1 M⊙ implies a 1.6× higher stellar mass than
a Chabrier (2003) IMF. 

It seems likely that the IMF in other present-day spiral 
galaxies is similar to that in the Milky Way disk, but that does 
not mean that the IMF has the same form in all galaxies and 
at all epochs. In particular, the most massive elliptical galax- 
ies have had very different star formation histories than spiral 
galaxies. Their central regions are thought to have formed 
in short-lived, highly dissipative events at high redshift (e.g., 
Naab et al. 2007; Hopkins et al. 2009; Kormendy et al. 2009; 
van Dokkum et al. 2010; Oser et al. 2010). Densities, tem-
peratures, turbulent velocities, and the dust and metal content
were almost certainly different from conditions in the present- 
day Milky Way, which may well have led to a different char- 
acteristic stellar mass (e.g., Padoan & Nordlund 2002; Bate, 
Bonnell, & Bromm 2003; Larson 2005; Krumholz 2011; My-
ers et al. 2011).

Motivated by these and other arguments, several recent 
studies have attempted to observationally measure or con- 
strain the form of the IMF in elliptical galaxies or their pro- 
genitors. In 2008, several papers argued that the IMF may 
have been "bottom-light" (dwarf-deficient) at early times, 
with a higher characteristic mass than the Milky Way IMF. 
van Dokkum (2008) [vD08] used the ratio of luminosity evo-
lution to color evolution of massive galaxies in clusters to con-
strain the IMF, a test first proposed by Tinsley (1980). Davé
(2008) found that the specific star formation rates of galax- 
ies at z ~ 2 are difficult to explain in the context of galaxy 
formation models unless the characteristic mass was higher 
than today. Wilkins, Trentham, & Hopkins (2008), following 
Fardal et al. (2007), argued that the z = 0 stellar mass density
is lower than the integral of the cosmic star formation history, 
unless star formation estimates at high redshift overestimate 
the formation rate of low mass stars. 

Short of counting individual stars, the most direct way to 
constrain the low mass IMF is to detect and quantify the 
light emitted by dwarf stars. As has been known for a long 
time, this is possible thanks to gravity-sensitive absorption 
features whose strengths are different in dwarfs and giants 
(see, e.g., Spinrad 1962; Cohen 1978; Carter, Visvanathan, 
& Pickles 1986; Couture & Hardy 1993; Conroy & van 
Dokkum 2012). The strongest dwarf-sensitive features are the 
Na I λ8183,8195 doublet (e.g., Faber & French 1980; Schi-
avon et al. 1997a) and the FeH λ9916 Wing-Ford band (e.g.,
Wing & Ford 1969; Schiavon, Barbuy, & Singh 1997b);
the strongest giant-sensitive feature (in the optical) is the
Ca II λ8498,8542,8662 triplet (e.g., Cenarro et al. 2003). This
work is technically challenging as dwarfs contribute only 5%
- 10% of the integrated light of stellar populations. There-
fore, a 30% absorption feature in the spectra of dwarf stars
has a depth of only a few percent in integrated light. De-
tecting IMF variations therefore requires line measurements
with exquisite accuracy (< 0.3%) in spectral regions that are
plagued by strong sky emission and (typically) poor detector 
performance. 
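
The dilution argument in the preceding paragraph can be checked with a few lines of arithmetic. The sketch below assumes, purely for illustration, that giant stars contribute no absorption at the wavelength of the feature, so that the depth in integrated light is the dwarf light fraction times the depth in dwarf spectra. The helper name `integrated_depth` is hypothetical.

```python
def integrated_depth(dwarf_light_fraction, depth_in_dwarfs):
    """Fractional depth of a dwarf-star feature in integrated light.

    Assumes (for illustration) that giants are featureless at this
    wavelength, so the feature is diluted by the dwarf light fraction.
    """
    return dwarf_light_fraction * depth_in_dwarfs


# Dwarfs contribute 5% to 10% of the light; the feature is 30% deep in dwarfs.
for fraction in (0.05, 0.10):
    depth = integrated_depth(fraction, 0.30)
    print(f"dwarf light fraction {fraction:.2f}: integrated depth {depth:.3f}")
```

Under this assumption the feature is 1.5% to 3% deep in integrated light, which is why the line measurements need sub-percent accuracy.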

Owing to the advent of fully depleted, high resistivity CCDs 
it has recently become possible to measure absorption lines 
in the far red with the required accuracy to detect variations 
in the dwarf-to-giant ratio. Using the upgraded red arm of 
the Low Resolution Imaging Spectrometer on Keck (LRIS;
Oke et al. 1995; Rockosi et al. 2010) we found that mas- 
sive elliptical galaxies in the Virgo and Coma clusters have 
enhanced Na I and Wing-Ford band absorption compared to 
metal-rich globular clusters and to expectations from stellar 
population synthesis models, indicating that the IMF in these 
galaxies is "bottom-heavy" with respect to that of the Milky 
Way (van Dokkum & Conroy 2010, 2011; Conroy & van 
Dokkum 2012). This result is consistent with constraints on 
the masses of early-type galaxies as derived from stellar dy- 
namics and lensing (e.g., Treu et al. 2010; Auger et al. 2010; 
Spiniello et al. 2011; Thomas et al. 2011; Dutton, Mendel, &
Simard 2012; Spiniello et al. 2012; Cappellari et al. 2012). 
However, it is opposite to the conclusions from the 2008 stud- 
ies. 

The main uncertainties in our initial study (van Dokkum & 
Conroy 2010) are the small sample size (four galaxies in Virgo 
with Na I and Wing-Ford measurements and four galaxies in 
Coma with Na I measurements), the relatively low signal-to- 
noise ratio (S/N) of the spectra, and the fact that our modeling 
did not allow for abundance variations of individual elements. 
As discussed in detail in Conroy & van Dokkum (2012) abun- 
dance variations can be separated from IMF variations by an- 
alyzing different absorption lines of the same element. As an 
example, the Wing-Ford band depends on the IMF but also on 
[Fe/H], and by comparing the strength of the Wing-Ford band 
to other iron lines the two variables can be separated. This 
approach requires high quality data and a stellar population 
synthesis model that allows simultaneous fitting of individual 
elemental abundances, the IMF, and other stellar population 
parameters. 

In the present paper we describe newly obtained high S/N 
Keck spectroscopy of 34 early-type galaxies spanning a large 
range in velocity dispersion and abundance patterns. We 
show that dwarf- and giant-sensitive absorption lines can be 
measured accurately for individual galaxies, and compare the 
measurements to the most massive Virgo ellipticals and to 
M31 globular clusters. In an Appendix we re-assess the re-
sults of vD08 in the context of recent results by van der Wel
et al. (2008) and Holden et al. (2010), as well as the Conroy 
& van Dokkum (2012) stellar population synthesis models. In 
a companion paper (Conroy & van Dokkum 2012b [paper II]) 
we fit the new Keck spectra with the stellar population syn- 
thesis model of Conroy & van Dokkum (2012) to quantify the 
IMF variation. 



2. SAMPLE AND OBSERVATIONS 
2.1. Sample Selection



The primary sample comprises a subset of the early-type 
galaxies observed in the SAURON survey (Bacon et al. 2001; 
de Zeeuw et al. 2002). Specifically, we observed 34 of 
the 48 E/S0 galaxies listed in Table 1 of Kuntschner et al. 
(2010). As discussed in de Zeeuw et al. (2002) the SAURON 
parent sample is not complete but was selected to span a 
large range in ellipticity and absolute magnitude. We could 
not observe the entire Kuntschner et al. (2010) sample be- 
cause of visibility constraints and the fact that we only had 
a single night of LRIS time. The 14 galaxies that were ex-
cluded are NGC 3032, NGC 3156, NGC 3489, NGC 4150,
NGC 4526, NGC 4550, and NGC 5831, because we gave pref-
erence to galaxies with ages > 9 Gyr; NGC 4374, NGC 4387,
NGC 4477, and NGC 5198, as we gave preference to galaxies
with metallicity 0.05 < Z < 0.20 and α-enhancement 0.15 <
[α/Fe] < 0.25 in that same part of the sky; and NGC 5982,
NGC 7332, and NGC 7457, as they were not observable in 
January. 

The sample is compared to the SAURON parent sample in 
Fig. 1. The LRIS sample discussed in this paper comprises
most of the SAURON sample, with an intentional bias against
the youngest galaxies (which tend to be fast rotators with
low dispersions and low α-enhancements). Using the derived
quantities of Kuntschner et al. (2010) for r < r_e/8, the galax-
ies in the LRIS sample have a median age of 11.2 Gyr, a me-
dian metallicity [Z/H] of 0.08, and a median α-enhancement
of 0.24. It contains 7 slow rotators and 27 fast rotators, as 
defined by Emsellem et al. (2007). 
Figure 1. Sample selection. The points show the full SAURON sam-
ple of Kuntschner et al. (2010) in the plane of α-enhancement versus
age within r_e/8. The symbol size scales with the velocity dispersion.
Circles denote slow rotators and ellipses denote fast rotators. Black 
points were observed with LRIS; grey points (mostly young galaxies 
with low velocity dispersions) were not observed. 

In addition to the SAURON galaxies we observed the cen-
tral regions of the bulge of M31. M31 has played an important
if somewhat confusing role in the decades-long quest to con- 
strain the low mass end of the IMF through absorption line 
spectroscopy. Spinrad & Taylor (1971) and Faber & French 
(1980) suggested that the nuclear regions have a very large
population of low mass stars, largely based on anomalously
strong Na I λ8183,8195 absorption. However, other studies
have shown that the effects of metallicity complicate the in-
terpretation (e.g., Cohen 1978; Carter et al. 1986).

2.2. Observing Strategy 

The galaxies were observed on the night of January 21, 
2012 with LRIS on the Keck I telescope. The 680 nm dichroic 
was used to split the light into the blue and red arms. In 
the blue arm the 600 lines mm⁻¹ grism, blazed at 4000 Å, gave
a spectral coverage of 3000 Å - 5600 Å. In the red arm the
600 lines mm⁻¹ grating blazed at 10,000 Å was set to cover the
wavelength range 7100 Å - 10,400 Å. We used a relatively
narrow (0."7) slit to maximize the spectral resolution. This is
not important for resolving lines in the galaxy spectra as they
all have σ > 100 km s⁻¹, but very helpful when correcting for
sky emission and absorption. The spectral resolution σ_instr, as
measured from sky emission lines, is ≈ 60 km s⁻¹ in the blue
arm (at 5500 Å) and ranges from ≈ 65 km s⁻¹ at 7200 Å to
≈ 45 km s⁻¹ at 9500 Å in the red arm. The red detector was
binned by a factor two in the spatial direction to reduce read-
out time. After applying the same binning (in software) to
the data from the blue detector the pixel scales are identical at
0."27.

Each galaxy was observed for 540 s, split into three 180 s 
exposures. The telescope was moved along the slit between 
exposures, such that each galaxy was observed in two posi- 
tions on one of the two detectors and in one position on the 
other detector. The slit was always positioned along the minor 
axis of the galaxy to minimize galaxy light near the edges of 
the slit and facilitate sky subtraction. The white dwarf GD 153 
(see Bohlin 1996) was observed to correct for the wavelength 
variation in the detector and instrument response. At the be- 
ginning and end of the night arc lamp exposures were ob- 
tained for wavelength calibration. Conditions were clear; we 
note that our initial data for this project (van Dokkum & Con- 
roy 2010) were taken through clouds and therefore had signif- 
icantly lower S/N ratio than the data described here. 

3. DATA REDUCTION 
3.1. Overview 

Although the LRIS long slit has a length of 168" the ef- 
fective slit length is much shorter. The reason is that the 
LRIS blue and red CCDs are both mosaics, and the slit is 
projected onto two independent detectors. On the blue side, 
75" is imaged on one detector, 14" falls in a gap between 
two detectors, and 79" falls on another detector. The detec- 
tors have slightly different characteristics, and due to optical 
distortions and flexure the relation between pixel location and 
wavelength needs to be determined independently for each 
~ half of the slit. As the flexure varies with the position of the 
telescope, each 540 s sequence effectively comprises 12 inde- 
pendent exposures: 2 detectors x 2 arms x 3 dither positions. 
Another consequence of the somewhat peculiar slit geometry 
is that the galaxies often extend to the edges of the slit halves, 
complicating the sky subtraction. 

Within these constraints the data reduction followed fairly 
standard procedures: bias subtraction, using the overscan re-
gions; correction for s-distortion; wavelength calibration, us-
ing a combination of arc lamps and the location of sky emis-
sion lines; subtraction of a 2D model of the sky lines; cosmic
ray identification and combination of the individual science
exposures in a sequence; extraction of one-dimensional spec-
tra, mimicking a circular aperture of r = r_e/8; correction for
detector and instrument response; and correction for atmo- 
spheric absorption. This last step is one of the most critical, 
as a near-perfect correction is required for our purposes. The 
steps are detailed below. The red and blue spectra were treated 
in the same way, unless noted otherwise. 

3.2. Distortion Correction and Wavelength Calibration 

After bias subtraction the spectra were placed on an undis- 
torted output grid that is linear in the wavelength and spatial 
axes. The s-distortion was mapped by fitting the position of 
the galaxy in the spatial direction with a Gaussian at 50-pixel 
intervals, and then fitting a 3rd-order polynomial to the mea-
sured positions. A two-dimensional wavelength solution was
obtained from arc lamp exposures, by fitting 3rd-order poly-
nomials in the wavelength direction and the spatial direction.
The median residual is typically ≈ 0.2 Å; higher order poly-
nomials did not improve the fit. These arc lamp solutions cap-
ture the tilt of the sky lines and the distortion in the wave-
length direction, but flexure in the spectrograph causes off-
sets of typically ≈ 2 Å between the science exposures and arc
lamps. Bright sky emission lines were used to find the zero-
order correction for these differences between the arc lamps
and the science exposures. A small linear correction was ap-
plied after extraction of the spectra (see § 3.7).

The spectra were mapped onto an output grid with 1 Å pix-
els in the wavelength direction and 0."27 pixels in the spatial
direction, using linear interpolation. This resampling method 
mostly conserves the noise properties of the data and the sharp 
edges of cosmic rays, and does not introduce significant alias- 
ing in sky lines which would complicate their subtraction. 

3.3. Sky Subtraction 

The subtraction of sky emission lines is complicated by the 
fact that the galaxies cover a substantial fraction of the slit. 
Sky lines were subtracted from the 2D spectra in several steps, 
making use of the fact that each of the two detectors has at 
least one exposure in each sequence with negligible galaxy
light (see § 2.2). First, a 1D model of the sky was created
using these sky exposures, by median filtering in the spatial
direction. Next, the spatial variation in the sky exposures was
modeled using a low order polynomial. A 2D sky model for
each detector was generated by replicating the 1D model in
the spatial direction, weighted by the polynomial fit. This 
(nearly) noise-free model of the sky emission for each detec- 
tor was then subtracted from the galaxy exposures. 
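
The sky-model construction described in this paragraph can be sketched as follows. This is a minimal illustration and not the reduction code used for the data: it assumes that the 1D model is the median over all rows of a sky exposure, that the spatial variation is the per-row least-squares amplitude of that 1D model, and that a 2nd-order polynomial is "low order". The function name `build_sky_model` and the synthetic frame are hypothetical.

```python
import numpy as np


def build_sky_model(sky_exposure, poly_order=2):
    """Build a nearly noise-free 2D sky model from one sky exposure.

    sky_exposure has shape (n_rows, n_wavelengths), with rows along the slit.
    """
    sky_exposure = np.asarray(sky_exposure, dtype=float)
    rows = np.arange(sky_exposure.shape[0])

    # 1D sky spectrum: median over the spatial direction.
    sky_1d = np.median(sky_exposure, axis=0)

    # Amplitude of the 1D model in every row (least-squares scale factor).
    amplitude = sky_exposure @ sky_1d / np.dot(sky_1d, sky_1d)

    # Smooth spatial variation: low order polynomial along the slit.
    coeffs = np.polyfit(rows, amplitude, poly_order)
    smooth_amplitude = np.polyval(coeffs, rows)

    # Replicate the 1D model along the slit, weighted by the polynomial.
    return np.outer(smooth_amplitude, sky_1d)


# Synthetic sky frame: one emission line on a continuum, with a gentle
# linear change in intensity along the slit.
wavelength = np.arange(8000.0, 8200.0)
true_sky = 10.0 + 50.0 * np.exp(-0.5 * ((wavelength - 8100.0) / 2.0) ** 2)
profile = 1.0 + 0.002 * (np.arange(41) - 20)
sky_frame = np.outer(profile, true_sky)

model = build_sky_model(sky_frame)
print(np.max(np.abs(model - sky_frame)))
```

The model would then be subtracted from the galaxy exposures taken on the same detector.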

This procedure is effective in removing sky emission with- 
out affecting the galaxy light. However, residuals remain, due 
to the variation in the intensity of sky lines on timescales of 
a few minutes: the sky exposure is typically about 4 minutes 
removed in time from the galaxy exposures on the same de- 
tector. These sky line residuals were removed by subtracting 
the average of the two edges of the detector at each wave- 
length. This step reduces the variation in the sky lines to the 
photon noise for most galaxies, but it comes at a cost. For 
the largest galaxies there is still detectable galaxy light at the 
edge of the slit, and by subtracting this light we reduced the 
utility of the data for measuring gradients in absorption fea- 
tures. Furthermore, if the gradients are strong the subtraction 
alters the observed absorption line strengths. The maximum 
effect on absorption lines occurs when the gradient is such 
that the absorption feature vanishes at the edge of the slit. In
that case, the observed absorption at radius r will be increased
by a factor of F_edge/F_r, with F_edge the (subtracted) galaxy flux
at the edge of the slit and F_r the galaxy flux at radius r. The
galaxy flux at the edge is always ≪ 1% of the average flux
in our extraction aperture; this effect is therefore negligible in 
our analysis. 

3.4. Cosmic Ray Identification and Combination of 
Exposures 

Cosmic rays and other defects were identified in the follow- 
ing way. First, a model for the galaxy light was created by tak- 
ing the median of the three individual exposures and then me- 
dian filtering in the spatial direction. After subtraction of this 
model, residual galaxy light was removed by fitting and sub-
tracting a 7th-order polynomial in the spatial direction. The
resulting residual exposure contains noise and cosmic rays. 
Next, a 2D-model of the total flux in each exposure was cre- 
ated from the 2D sky model and the galaxy model. This to- 
tal flux model M was converted to a noise model N through 
$N = g^{-1}\sqrt{M \times g}$, with g the gain. A pixel was flagged as a cos-
mic ray when its flux in the residual exposure was 7× higher
than the flux in the noise model. Due to the linear resampling
cosmic rays are slightly smoothed with respect to the origi- 
nal exposures; to take this into account all pixels neighboring 
cosmic ray pixels were also flagged. 
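
A minimal sketch of this flagging step is given below. It assumes that the total flux model is in detector counts, that the gain is in electrons per count, and that "neighboring" means the eight pixels surrounding a flagged pixel; the source does not specify the neighborhood. The function name `flag_cosmic_rays` is hypothetical.

```python
import numpy as np


def flag_cosmic_rays(residual, total_flux_model, gain, threshold=7.0):
    """Flag pixels whose residual exceeds threshold times the Poisson noise.

    residual and total_flux_model are 2D arrays with the same shape.
    """
    # Poisson noise in counts: sqrt(M * g) electrons, divided by the gain.
    noise = np.sqrt(np.clip(total_flux_model, 0.0, None) * gain) / gain
    hits = residual > threshold * noise

    # Grow the mask by one pixel in every direction (assumed 8 neighbours),
    # to account for the smoothing introduced by the linear resampling.
    n_rows, n_cols = hits.shape
    padded = np.pad(hits, 1, mode="constant", constant_values=False)
    grown = np.zeros_like(hits)
    for dy in (0, 1, 2):
        for dx in (0, 1, 2):
            grown |= padded[dy:dy + n_rows, dx:dx + n_cols]
    return grown


# One bright pixel on a flat 100-count background (noise of 10 counts).
flux_model = np.full((11, 11), 100.0)
residual_frame = np.zeros((11, 11))
residual_frame[5, 5] = 100.0
mask = flag_cosmic_rays(residual_frame, flux_model, gain=1.0)
print(mask.sum())
```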

The three exposures of each galaxy were summed to create 
a combined, sky subtracted 2D output frame. Pixels affected 
by a cosmic ray in one exposure were replaced by 1.5× the
sum of the other two exposures. In the rare cases where two
exposures were affected by a cosmic ray the pixel in the out-
put frame is 3× the flux in the unaffected exposure. The lo-
cations of pixels that were affected by a cosmic ray in at least
one of the three exposures are stored for diagnostic purposes. 
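
The replacement rule amounts to rescaling the sum of the unaffected exposures by 3 divided by their number. The sketch below illustrates this with a hypothetical function `combine_three`. The case in which all three exposures are affected is not described in the text; returning NaN there is an assumption made only for this example.

```python
import numpy as np


def combine_three(exposures, cr_masks):
    """Sum three exposures, rescaling pixels that lost one or two frames.

    exposures and cr_masks have shape (3, n_rows, n_cols); a True mask
    value marks a pixel affected by a cosmic ray in that exposure.
    """
    exposures = np.asarray(exposures, dtype=float)
    good = ~np.asarray(cr_masks, dtype=bool)
    n_good = good.sum(axis=0)
    good_sum = np.where(good, exposures, 0.0).sum(axis=0)

    # 3 good frames: plain sum; 2 good: 1.5 x their sum; 1 good: 3 x its flux.
    with np.errstate(divide="ignore", invalid="ignore"):
        combined = np.where(n_good > 0, good_sum * 3.0 / n_good, np.nan)

    affected = n_good < 3  # kept for diagnostic purposes
    return combined, affected


frames = np.full((3, 4, 5), 10.0)
masks = np.zeros((3, 4, 5), dtype=bool)
frames[0, 1, 2], masks[0, 1, 2] = 1000.0, True   # hit in one exposure
frames[1, 3, 3], masks[1, 3, 3] = 500.0, True    # hit in two exposures
frames[2, 3, 3], masks[2, 3, 3] = 700.0, True
combined, affected = combine_three(frames, masks)
print(combined[1, 2], combined[3, 3], affected.sum())
```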

3.5. Extraction of Spectra and Flux Calibration 

One-dimensional spectra were extracted from the 2D spec- 
tra by summing the flux in the spatial direction. An extrac- 
tion aperture of r e /8 was used; the effective radii were taken 
from Kuntschner et al. (2010) and corrected to the minor axis. 
This aperture is also used by the SAURON survey (along with
larger apertures), allowing direct comparisons to their results. 
A "straight" summation of the long slit spectrum over the 
range -r_e/8 < r < r_e/8 would be weighted more towards the
center of the galaxy than the summation in a circular aperture
of SAURON. To mimic summation in a circular aperture with
radius r = r_e/8 we extracted the spectrum as follows:

$$F_\lambda = \sum_{y=-1}^{1} F_{\lambda,y} + \sum_{y=-n}^{-2} (-y)\,F_{\lambda,y} + \sum_{y=2}^{n} y\,F_{\lambda,y} \qquad (1)$$

with y the pixel coordinate in the spatial direction (with the 
galaxy centered in the middle of pixel zero) and n the nearest 
integer number of pixels corresponding to r_e/8. The first term
in Eq. (1) is a straightforward sum of the central three rows,
corresponding to a rectangular aperture of 0."81 × 0."70. The
other two terms extend the summation to r_e/8, weighting by
the distance from the central row. Weighted in this manner 
the spectra can be compared directly to the SAURON mea- 
surements, and represent a larger fraction of the galaxy light. 
For an r^{1/4} law, a circular aperture of r_e/8 contains ∼ 10% of
a galaxy's total flux. 
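
The weighting scheme of Eq. 1 (unit weight for the central three rows, and a weight equal to the distance from the central row for rows out to n) can be sketched as below. The function names `aperture_weights` and `extract_circular` are hypothetical, and the example assumes that the galaxy center falls exactly on the row `center_row`.

```python
import numpy as np


def aperture_weights(n):
    """Row weights for offsets y = -n..n: 1 for |y| <= 1, |y| otherwise."""
    offsets = np.arange(-n, n + 1)
    return np.where(np.abs(offsets) <= 1, 1.0, np.abs(offsets))


def extract_circular(spectrum_2d, center_row, n):
    """Weighted sum of rows center_row - n .. center_row + n.

    spectrum_2d has shape (n_rows, n_wavelengths).
    """
    spectrum_2d = np.asarray(spectrum_2d, dtype=float)
    if n < 1 or center_row - n < 0 or center_row + n >= spectrum_2d.shape[0]:
        raise ValueError("aperture does not fit on the detector")
    rows = spectrum_2d[center_row - n:center_row + n + 1]
    return aperture_weights(n) @ rows


# A flat frame makes the weights easy to verify: 4+3+2+1+1+1+2+3+4 = 21.
flat_frame = np.ones((21, 6))
spectrum = extract_circular(flat_frame, center_row=10, n=4)
print(spectrum)
```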



Next, the extracted spectra were calibrated using a response 
curve that produces a flat spectrum for an object with constant
F_λ. No absolute calibration was attempted, but given the rapid
fall-off of the detector response at wavelengths > 9600 Å it is
important to obtain a reasonably accurate relative calibration 
as a function of wavelength. The response curve was created 
as follows. First, a boxcar-smoothed high S/N ratio halogen 
lamp spectrum was used to model the small-scale variations 
in the response. This spectrum is shown by the dashed line 
in Fig. 2. Next, the extracted spectrum (corrected for atmo-
spheric absorption; see below) of the white dwarf GD 153 was
divided by the halogen-derived response curve and then di-
vided by the Rayleigh-Jeans approximation of its spectrum
(F_λ ∝ λ⁻⁴). The residual spectrum was fitted by a low order
polynomial. The final response curve, indicated by the solid
line in Fig. 2, was created by multiplying this polynomial by
the halogen-derived response curve. From a comparison of 
the galaxy spectra to the stellar population synthesis models 
of paper II we estimate that the relative uncertainty in the cal-
ibration as a function of wavelength is < 5% on ∼ 1000 Å
scales. 
Figure 2. Response curve that was used to calibrate the red spectra
(solid line), as determined from the white dwarf GD 153. Division by
the response curve produces a flat spectrum for an object with con-
stant F_λ. The dashed line shows a high S/N halogen lamp spectrum
which was used to capture the small-scale variation in the response. 



3.6. Correction for Atmospheric Absorption

As we aim to measure weak stellar absorption lines with 
high accuracy it is crucial to correct for absorption in our own 
atmosphere. The strongest atmospheric absorption features in 
the optical are the "B" and "A" bands of O2 at ∼ 6870 Å and ∼
7600 Å respectively. However, the weaker but numerous H2O
bands in the regions 8100 Å - 8400 Å and 8900 Å - 9800 Å
are a more serious challenge for our program, as several key 
features (in particular the Na I doublet) fall in this wavelength 
region. 
Figure 3. Illustration of our correction for atmospheric absorption, for NGC 5846. (a) Template atmospheric absorption spectrum. (b) Galaxy
spectrum (black) and template absorption spectrum (red) in the ∼ 9350 Å region, where the absorption is strong. The template is scaled to
match the observed galaxy spectrum in this wavelength region. (c) Galaxy spectrum before and after division by the scaled absorption template,
near the Na I doublet. The blue line shows the best-fitting stellar population synthesis model from paper II. 



The standard method to correct for atmospheric absorption 
is to observe blue stars that are located near the science tar- 
gets in the sky. Typically a star is observed before and after 
the target, so that the varying absorption can be interpolated 
to match the time and sky position of the science observation 
(see, e.g., Kriek et al. 2008). In practice telluric standards 
are usually A stars, as O and B stars are rare and typically 
not available in the general direction of the science target. 
This procedure suffers from several drawbacks. A stars have 
strong Paschen lines at wavelengths > 8200 Å, which need
to be divided out before the spectrum can be used to model
the atmospheric absorption. They also have weak metal lines
which can introduce systematic errors at the 0.5% - 1% level.
Finally, telluric standards carry significant overhead, particu- 
larly given the requirement that they are observed before and 
after each science target. 

Here we take a different approach, and correct for atmo- 
spheric absorption by scaling a template spectrum to the ob- 
served absorption. The scaling is parameterized by 



$$T_f = 1 + f\,(T_0 - 1), \qquad (2)$$

with f a scale factor and T_0 a template absorption spectrum.
For each galaxy the best-fitting value of f is found by mini-
mizing |G - T_f|, with G the galaxy spectrum. The fit is done
over the wavelength range 9320 Å - 9380 Å, as this region
is dominated by strong atmospheric H2O lines and does not
contain strong (redshifted) galaxy absorption features. Prior
to the minimization G and T_f are divided by a 4th-order poly-
nomial fit in the wavelength range 9250 Å - 9650 Å. The tem-
plate T_0 is appropriate for Mauna Kea and smoothed to the
instrumental resolution.
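
The template scaling can be illustrated with a short sketch. It assumes that both spectra are already continuum-normalized (the 4th-order polynomial division is omitted here) and that the minimization is done in the least-squares sense, which gives a closed-form solution for the scale factor. The text does not state which norm was used, so this is one possible reading. The function names and the synthetic template are hypothetical.

```python
import numpy as np


def scaled_template(template, f):
    """Absorption template scaled in strength: 1 + f (T_0 - 1)."""
    return 1.0 + f * (np.asarray(template, dtype=float) - 1.0)


def fit_telluric_scale(galaxy, template):
    """Least-squares scale factor f such that galaxy ~ 1 + f (T_0 - 1).

    Both inputs are assumed to be continuum-normalized and restricted to
    the fitting window (9320 - 9380 Angstrom in the text).
    """
    depth = np.asarray(template, dtype=float) - 1.0
    return np.dot(np.asarray(galaxy, dtype=float) - 1.0, depth) / np.dot(depth, depth)


# Synthetic template with three absorption lines in the fitting window.
wavelength = np.arange(9320.0, 9381.0)
template_t0 = np.ones_like(wavelength)
for center, depth in ((9330.0, 0.4), (9350.0, 0.6), (9370.0, 0.3)):
    template_t0 -= depth * np.exp(-0.5 * ((wavelength - center) / 1.5) ** 2)

# A featureless "galaxy" seen through 70% of the template absorption.
observed = scaled_template(template_t0, 0.7)
f_best = fit_telluric_scale(observed, template_t0)
corrected = observed / scaled_template(template_t0, f_best)
print(f_best)
```

Dividing the observed spectrum by the scaled template removes the absorption; for real data the residuals depend on how well the template matches the atmosphere at the time of observation.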

The procedure (also explored in Schiavon 1998) is illus-
trated in Fig. 3 for one of the sample galaxies (NGC 5846).
The theoretical absorption template is shown in panel (a). 
Panel (b) shows the detailed region of the template around the 
strongest absorption lines, where the fit is done to determine 
the best match to the galaxy spectrum. The top spectrum in 
panel (c) shows the significant effect of atmospheric absorp-
tion lines on the region around the Na I λ8183,8195 feature.
After division by the scaled template (red) the narrow sky 
lines are nearly perfectly removed. The blue line is the best- 
fitting template from paper II. An analysis of the residuals of 
these fits in regions affected by sky absorption shows that the 
absorption correction is good to about 5 - 10% of the feature
strength. At our spectral resolution the strongest absorption
feature in the 8000 Å - 8500 Å range is ∼ 10%, which im-
plies that the largest residuals are 0.5 - 1%. As the width of
sky absorption lines is typically ∼ 20% of the width of galaxy
absorption features, this results in a 0.1 - 0.2% uncertainty in
the strength of a galaxy absorption line that coincides with a 
relatively strong sky absorption line. 
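
The error budget in this paragraph is a product of three numbers, which the following sketch makes explicit. The function name `line_strength_uncertainty` is hypothetical, and the simple multiplication by the width ratio is taken from the reasoning in the text rather than from a detailed line-profile calculation.

```python
def line_strength_uncertainty(sky_depth, correction_error, width_ratio):
    """Uncertainty in a galaxy line that overlaps a sky absorption line.

    sky_depth: fractional depth of the sky absorption feature
    correction_error: fractional accuracy of the telluric correction
    width_ratio: sky line width divided by galaxy line width
    """
    residual = sky_depth * correction_error  # leftover sky absorption
    return residual * width_ratio            # diluted over the wider galaxy line


# 10% deep sky feature, corrected to 5% - 10%, sky lines 20% as wide.
for accuracy in (0.05, 0.10):
    value = line_strength_uncertainty(0.10, accuracy, 0.20)
    print(f"correction accuracy {accuracy:.2f}: uncertainty {100 * value:.1f}%")
```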

3.7. Optimizing the Wavelength Calibration 

The standard wavelength calibration as described in § 3.2
is correct to approximately ±1 pixel (±30 km s⁻¹) over the
entire wavelength range. Although this level of accuracy is 
sufficient for most purposes, it is a source of uncertainty in 
the analysis in paper II. The reason for this sensitivity to the 
wavelength calibration is that in our methodology a template 
spectrum is directly fit to the observed spectrum. As a result, 
a small error in the wavelength calibration increases the χ²
value of the fit. In the fitting this increase can be partially 
compensated by changing the line strength in the model, thus 
potentially leading to erroneous abundances and other fit pa- 
rameters. 

We optimized the wavelength calibration by fitting an α-
enhanced, 13.5 Gyr old stellar population synthesis model to
the blue and red spectra in narrow wavelength regions. The
spectra were divided into ∼ 20 regions, each with a width of
250 Å. The model was smoothed to the velocity dispersion of
the galaxies and fit to each of the regions, with velocity as the
only free parameter. The deviations from the average velocity,
expressed in Å, were fit with a linear function in wavelength.
Each galaxy was fit separately, and the red and blue spectra
were treated independently. The linear fits to the corrections
were then applied to the wavelengths of the extracted spec-
tra. The fit procedure is illustrated in Fig. 4 for four galaxies
that exhibit the full range of corrections. The slope of the re-
quired corrections broadly correlates with the time of night
(and hence with the NGC number of the galaxies), presum-
ably because the true wavelength calibration deviates more
and more from the arc lamp solution. After this correction the
systematic errors are less than 10 km s⁻¹ over the full wave-
length range from 3500 Å to 10,000 Å for all galaxies.
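
The correction step can be sketched as follows, taking the per-region velocities as given (the model fitting itself is not reproduced). The sketch converts each velocity deviation to a wavelength offset with Δλ = λ Δv / c, fits a straight line, and subtracts it from the wavelength scale. The sign convention, in which a positive deviation means the recorded wavelengths are too large, is an assumption, as are the function names and the synthetic velocities.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s


def fit_wavelength_correction(region_centers, region_velocities):
    """Linear fit (in wavelength) to the velocity deviations expressed in Angstrom."""
    centers = np.asarray(region_centers, dtype=float)
    velocities = np.asarray(region_velocities, dtype=float)
    delta_v = velocities - velocities.mean()   # deviation from the average
    delta_lambda = centers * delta_v / C_KMS   # km/s -> Angstrom
    slope, intercept = np.polyfit(centers, delta_lambda, 1)
    return slope, intercept


def apply_wavelength_correction(wavelength, slope, intercept):
    """Remove the fitted linear error from the wavelength scale (assumed sign)."""
    wavelength = np.asarray(wavelength, dtype=float)
    return wavelength - (slope * wavelength + intercept)


# Synthetic red-arm example: twelve 250 Angstrom regions with a known
# linear wavelength error, a_true * lambda + b_true, in Angstrom.
centers = np.arange(7250.0, 10250.0, 250.0)
b_true = 0.5
a_true = -b_true * np.mean(1.0 / centers)  # chosen so the mean velocity offset is zero
velocities = 1500.0 + C_KMS * (a_true + b_true / centers)

slope, intercept = fit_wavelength_correction(centers, velocities)
corrected_centers = apply_wavelength_correction(centers, slope, intercept)
print(slope, intercept)
```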

4. EXTRACTED SPECTRA 

The extracted spectra are shown in Fig. 5, ordered by in-
creasing velocity dispersion. The quality is generally high:
essentially all the visible features in the spectra (except in the 
far blue and far red) are spectral lines, not noise. Neverthe- 
less, it is clear that some spectra have a higher S/N ratio than 
others. This variation largely stems from the fact that each 
galaxy was observed for 540 s irrespective of its brightness.
The region around 9500 Å suffers from strong sky absorption
(see Fig. 3), and is shown in light grey.

For a correct interpretation of the model fits in paper II it is 
crucial to have realistic estimates of the noise in the spectra. 
The formal noise was determined from Poisson statistics, tak- 
ing the gain, the sky spectrum, and the applied weighting (Eq.
1) into account. It is difficult to test whether the actual noise
corresponds to the expected noise, as the observed variation 
in the spectra is almost entirely due to the "forest" of weak 
absorption lines present in the atmospheres of cool stars. We 
empirically determined the noise properties of the spectra in 
the following way. We selected two galaxies with very simi- 
lar velocity dispersions, [Fe/H], and [Mg/Fe] (as determined 
in paper II), and an average S/N ratio which is typical for the 
full sample. The galaxies NGC 4570 and NGC 4660 satisfy 
all these criteria. A small part of their spectra, centered around
the calcium triplet, is shown in Fig. 6a. As expected from
Figure 4. Errors in the wavelength calibration, resulting from
wavelength- and time-dependent differences between the arc lamp
solution and the galaxy spectra. The points were determined from
model fits in 250 Å wide spectral regions; the four galaxies that are
shown span the full range of variation in the errors. Lines indicate
linear fits (in wavelength) to the errors; the residuals after this cor-
rection are ∼ 10 km s⁻¹.

the selection the spectra are very similar, although NGC 4570 
has slightly stronger absorption lines than NGC 4660. 

The difference between the two normalized spectra is shown in
Fig. 6(b). The 1σ scatter in the difference spectrum is
0.0059 Å⁻¹, as determined with the biweight estimator (e.g.,
Beers, Flynn, & Gebhardt 1990). Assuming each galaxy contributes
equally to the scatter, this corresponds to σ ≈ 0.0041 Å⁻¹ for
each individual galaxy. In Fig. 6(c) the scatter is calculated in
bins of 20 Å and divided by the average formal error in the two
spectra. A value of 1 implies that the observed differences
between the spectra can be fully explained by Poisson statistics.
The median is 1.1 over the displayed wavelength range and 1.2
over the entire 7800 Å - 10,200 Å range, which means that the
formal errors are a good approximation of the actual
uncertainties in the spectra. Similar comparisons of other galaxy
pairs consistently show that the formal errors are reasonable,
particularly in regions away from strong sky emission or
absorption lines.
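
The empirical noise estimate described above can be sketched in a few lines. This is an illustration only, not the code used for the paper: the function names are hypothetical, the tuning constant c = 9 is the conventional choice for the biweight scale rather than a value stated in the text, and the two input spectra are assumed to be normalized and sampled on the same wavelength grid.

```python
import numpy as np

def biweight_scale(x, c=9.0):
    """Robust scatter estimate (biweight scale; see Beers et al. 1990).

    c = 9 is the conventional tuning constant (an assumption here).
    """
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    if mad == 0:
        return 0.0
    u = (x - med) / (c * mad)
    keep = np.abs(u) < 1.0            # points beyond c*MAD get zero weight
    d = x[keep] - med
    u2 = u[keep] ** 2
    num = np.sqrt(x.size * np.sum(d**2 * (1.0 - u2) ** 4))
    den = np.abs(np.sum((1.0 - u2) * (1.0 - 5.0 * u2)))
    return num / den

def per_galaxy_scatter(spec_a, spec_b):
    """Scatter per spectrum, assuming both contribute equally to the difference."""
    diff = np.asarray(spec_a, dtype=float) - np.asarray(spec_b, dtype=float)
    return biweight_scale(diff) / np.sqrt(2.0)

# Synthetic check: two featureless spectra with Gaussian noise of 0.0041 each.
rng = np.random.default_rng(1)
a = 1.0 + rng.normal(0.0, 0.0041, 5000)
b = 1.0 + rng.normal(0.0, 0.0041, 5000)
print(per_galaxy_scatter(a, b))
```

Dividing the scatter of the difference spectrum by √2 is what converts the measured 0.0059 into the quoted per-galaxy value, to within rounding.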

Having verified that the formal S/N ratios are reasonable, we
show the S/N as a function of wavelength for all galaxies in
Fig. 7. As expected there is significant variation between
galaxies and between wavelengths. The S/N ratio ranges from ~60
in the blue for the worst spectra to ~500 in the red for the best
spectra. The median S/N of all galaxies is 90 Å⁻¹ at 3800 Å,
256 Å⁻¹ at 5000 Å, 310 Å⁻¹ at 8500 Å, and 211 Å⁻¹ at 10,000 Å.
There is no obvious trend with velocity dispersion.

5. ANALYSIS OF IMF-SENSITIVE FEATURES 

Here we investigate the observed variation in the
well-established IMF-sensitive features Na I λ8183,8195, the
Ca II λ8498,8542,8662 triplet, and the FeH λ9916 Wing-Ford band.
The Na I doublet and the Wing-Ford band are strong in dwarf stars
and weak in giants; conversely, the Ca II triplet is strong in
giants and weak in dwarfs. As a result, galaxies with
bottom-heavy IMFs are expected to have stronger Na I, stronger
Wing-Ford, and weaker Ca II absorption than galaxies with
bottom-light IMFs (see Couture & Hardy 1993; Cenarro et al. 2003;
van Dokkum & Conroy 2010; Conroy & van Dokkum 2012, and many
other studies).

Figure 5. Extracted LRIS spectra in the rest-frame, using a (circularized) r < r_e/8 aperture. Starting at the top galaxies are ordered by increasing
velocity dispersion. The spectra are normalized at 4050 Å (blue side) and 9050 Å (red side). The region of strong sky absorption around 9500 Å
is shown in light grey. The blue cutoff in the displayed red spectra is dictated by the onset of the atmospheric B band at ~7600 Å. The spectra
are of high quality, and extend beyond 1 μm in the rest-frame.

Figure 6. (a) Comparison of the spectra of two galaxies with similar
ages, abundances, and velocity dispersions, in the wavelength region
near the Ca II triplet. The spectra have a S/N ratio of ~250 Å⁻¹,
which is typical for our sample. (b) Difference of the two spectra.
Apart from a slight difference in Ca II absorption the variation
between the spectra is very small and appears mostly random. The
differences (divided by √2) have a 68 % range of ±0.4 %. (c) Observed
scatter in (b) divided by the expected noise from Poisson statistics.
Away from strong absorption features the observed differences
between NGC 4660 and NGC 4570 are fully consistent with the formal
errors. This demonstrates that the formal errors are a good
approximation of the uncertainties in the spectra.

5.1. Dispersion Matching 

Figure 7. S/N ratio per rest-frame Å for all spectra. For
individual galaxies the S/N is fairly uniform between ~4500 Å and
~10,000 Å, but there is considerable variation between galaxies.
The spectra are color-coded according to their velocity dispersion,
from low = blue to high = red. There is no strong correlation
between S/N ratio and velocity dispersion.

Prior to measuring the strength of absorption lines the galaxies
have to be smoothed to the same velocity dispersion. Velocity
dispersions of the individual galaxies were measured directly
from the extracted spectra, taking the instrumental resolution
and the resolution of the template into account (see paper II).
All galaxies except M87 were smoothed to a common resolution,
using a Gaussian of width

$$\sigma_{\rm s} = \sqrt{300^2 - \left(\sigma_*^2 + \sigma_{\rm instr}^2\right)}, \qquad (3)$$

with $\sigma_*$ the stellar velocity dispersion and
$\sigma_{\rm instr}$ the instrumental resolution (all in km s⁻¹).
This smoothing has the potential to broaden localized sky line
residuals, thus "contaminating" the spectrum on 300 km s⁻¹
scales. To prevent this, and to reduce the effect of sky line
residuals on measured absorption line indices, pixels coinciding
with strong sky lines were not taken into account in the
smoothing. This was done iteratively, in each iteration replacing
pixels in the unsmoothed spectrum that coincide with sky lines by
pixels from the smoothed spectrum of the previous iteration.
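
A minimal sketch of the dispersion matching may help to make the procedure concrete. It is not the pipeline used for this paper: the function names, the fixed number of iterations, and the assumption that the spectrum is sampled on a grid with constant velocity width per pixel (so that the kernel width in pixels is the width in km s⁻¹ divided by the pixel width in km s⁻¹) are ours for illustration only.

```python
import numpy as np

def smoothing_sigma(sigma_star, sigma_instr, target=300.0):
    """Gaussian width (km/s) that brings a spectrum to the target dispersion."""
    var = target**2 - (sigma_star**2 + sigma_instr**2)
    if var <= 0:
        raise ValueError("spectrum is already broader than the target dispersion")
    return np.sqrt(var)

def gaussian_smooth(flux, sigma_pix):
    """Convolve with a normalized Gaussian; edges are padded with the end values."""
    half = int(np.ceil(5 * sigma_pix))
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / sigma_pix) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(flux, half, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

def smooth_ignoring_sky(flux, sky_mask, sigma_pix, n_iter=10):
    """Iteratively replace sky-line pixels by the previous smoothed spectrum."""
    work = np.array(flux, dtype=float)
    for _ in range(n_iter):
        smoothed = gaussian_smooth(work, sigma_pix)
        work[sky_mask] = smoothed[sky_mask]   # only masked pixels are replaced
    return gaussian_smooth(work, sigma_pix)

# Example: a flat spectrum with one strong sky residual that is masked.
flux = np.ones(200)
flux[100] = 10.0
mask = np.zeros(200, dtype=bool)
mask[100] = True
clean = smooth_ignoring_sky(flux, mask, sigma_pix=3.0)
print(smoothing_sigma(200.0, 50.0), np.max(np.abs(clean - 1.0)))
```

In this toy case the masked residual decays geometrically with each iteration, so the final smoothed spectrum is indistinguishable from the uncontaminated one.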

5.2. Nal, the Wing-Ford band, and Call 

The three IMF-sensitive absorption lines are detected with high
significance in all galaxies, as shown in Fig. 8. The top panels
show the spectra at their original spectral resolution; in the
bottom panels they are smoothed to a common velocity dispersion
of 300 km s⁻¹. For clarity only one line of the calcium triplet
is shown. The spectra were divided by a linear fit to the
side-bands of the features (see below). The bulge of M31 and M87
are not shown; NGC 3414 and NGC 3608 were also excluded because
they have unexplained noise peaks in their Na I (NGC 3414) and
Wing-Ford (NGC 3608) regions.

The black and yellow lines in Fig. 8 illustrate the
IMF-sensitivity of these lines, using stellar population
synthesis models from Conroy & van Dokkum (2012). Both models
have an age of 13.5 Gyr, a Solar iron abundance, and are
α-enhanced with [α/Fe] = 0.2. The black model is for a Chabrier
(2003) IMF and the yellow model is for a bottom-heavy IMF with a
logarithmic slope x = 3. The data span a similar range as these
model predictions. More to the point in the context of the
present paper, the data quality is sufficiently high to measure
the subtle differences in absorption line strength expected from
IMF variations.

Figure 8. IMF-sensitive absorption features. Na I and FeH are strong in dwarf stars and weak in giants; the calcium triplet is strong in giants and
weak in dwarfs. Top panels are at the original resolution; in the bottom panels the spectra are smoothed to a common dispersion of 300 km s⁻¹.
The spectra are color-coded by their velocity dispersion, going from blue (low) to red (high). The black and yellow lines show expectations for
13.5 Gyr old, α-enhanced stellar populations with two different IMFs (Milky Way and bottom-heavy). The dispersion-matched spectra have
sufficiently high S/N ratio to distinguish between these predictions.

As can be seen in the bottom left panel of Fig. 8, the line
profile of the observed Na I feature changes with its depth: its
centroid is bluer when the absorption is stronger. We show this
relation between the strength of Na I and its centroid (here
simply defined as the wavelength of maximum absorption)
explicitly in Fig. 9. The relation arises because Na I is a blend
of the Na I λ8183,8195 doublet and the TiO (0,2) bandhead at a
resolution of 300 km s⁻¹ (see Schiavon et al. 1997a). As the
sodium absorption becomes stronger the centroid of the feature
shifts toward the Na I doublet and away from the TiO bandhead.
This effect is illustrated in the inset of Fig. 9, which shows
the effect of increasing the number of dwarf stars (and hence the
sodium doublet strength) on the measured feature at 300 km s⁻¹
resolution. The existence of the tight relation in Fig. 9
demonstrates that the observed variation in the Na I feature
strength is driven by variation in Na I absorption, not variation
in the TiO band.
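
The origin of the centroid shift can be illustrated with a toy blend. This sketch is purely schematic and rests on several assumptions that are not in the text: the two doublet lines and the TiO bandhead are each represented by a Gaussian of 300 km s⁻¹ width (a real bandhead is asymmetric), the bandhead is placed at 8205 Å, and the depths are arbitrary illustrative numbers.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

def gaussian(wave, center, sigma):
    return np.exp(-0.5 * ((wave - center) / sigma) ** 2)

def toy_feature(wave, na_depth, tio_depth=0.03, sigma_kms=300.0):
    """Normalized flux of a schematic Na I + TiO blend (Gaussian stand-ins)."""
    sigma_aa = 8190.0 * sigma_kms / C_KMS        # velocity width converted to Angstrom
    absorption = na_depth * (gaussian(wave, 8183.0, sigma_aa)
                             + gaussian(wave, 8195.0, sigma_aa))
    absorption += tio_depth * gaussian(wave, 8205.0, sigma_aa)
    return 1.0 - absorption

def centroid(wave, flux):
    """Centroid defined as the wavelength of maximum absorption."""
    return wave[np.argmin(flux)]

wave = np.arange(8150.0, 8250.0, 0.1)
weak = centroid(wave, toy_feature(wave, na_depth=0.02))
strong = centroid(wave, toy_feature(wave, na_depth=0.06))
print(weak, strong)   # the stronger doublet pulls the centroid blueward
```

With the TiO contribution held fixed, increasing only the doublet depth moves the wavelength of maximum absorption toward the doublet, which is the qualitative behavior shown in the inset of Fig. 9.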

The relation between centroid and feature strength also provides
an empirical upper limit to the errors in the line strength
measurements. Excluding the two most deviant points, we find that
the centroid of the feature predicts its strength with an rms
scatter of only 0.0033 (dashed line in Fig. 9). This is a strict
upper limit on the error as errors in the centroid measurements,
variations in the TiO bandhead, and variation in other absorption
features in the central band or side bands all contribute to this
scatter. The tight relation has another interesting implication.
As will be shown in paper II the form of the IMF correlates with
the Na I strength. Therefore one could, in principle, infer the
IMF from a simple, model-independent measurement: the centroid of
the Na I feature at a resolution of 300 km s⁻¹.
Figure 9. Correlation between the centroid of the Na I feature and
its strength. The inset shows the origin of the correlation. The blue
line is for a bottom-light IMF with weak Na I absorption and the red
line is for a bottom-heavy IMF with strong Na I absorption. The
broken lines show the same spectra at 300 km s⁻¹ resolution. At this
resolution the doublet is blended with a TiO bandhead at ~8205 Å,
and increased Na I absorption moves the centroid of the measured
absorption feature toward that of the bluer Na I doublet.

Figure 10. (a-c): The Na I doublet, the Wing-Ford band, and the Ca II triplet in the M6 dwarf Gliese 406 (red) and the M4 giant HD 4408 (blue).
For a larger dwarf-to-giant ratio Na I and Wing-Ford are expected to be stronger and Ca II is expected to be weaker. (d-f): The strength of these
features in integrated light, as a function of velocity dispersion. Black dots are individual SAURON galaxies. The purple triangle represents
metal-rich globular clusters in M31 (van Dokkum & Conroy 2011), and the orange square is measured from the average spectrum of
high-dispersion elliptical galaxies in the Virgo cluster (van Dokkum & Conroy 2010). Na I and Ca II show strong and opposing trends, consistent
with more bottom-heavy IMFs for galaxies with higher dispersions. The relation between the Wing-Ford band and σ is only significant when
the globular clusters and the massive Virgo galaxies are included.



5.3. Correlations With Velocity Dispersion 

The spectra in Fig. 8 are color-coded by their velocity
dispersion. In the top panels the dispersion is trivially related
to the width of the feature; this is particularly obvious for the
calcium line in the top right panel. In the bottom panels the
spectra are smoothed to the same dispersion, and yet trends with
the galaxies' velocity dispersion remain: galaxies with high
velocity dispersions tend to show strong Na I absorption and weak
Ca triplet absorption.

We analyze these trends by measuring the absorption line 
strengths of the IMF-sensitive features. Such line strengths 
are useful for highlighting trends of specific absorption fea- 
tures in the data. However, we note here that line indices are 
notoriously difficult to interpret quantitatively as they rarely 
measure the abundance of a single element in a straightfor- 
ward way. Their central bands or side bands typically contain 
faint lines of other elements (e.g., Schiavon et al. 1997a; Kel- 
son et al. 2006); they suffer from well-documented degen- 
eracies between age and abundance (e.g., Worthey 1994); and 
multiple features of the same element are typically needed to 
differentiate abundance effects from IMF effects (Conroy & 
van Dokkum 2012). For these reasons we do not use line in- 
dices in paper II, where we quantify the IMF, but fit the galaxy 
spectra directly with comprehensive stellar population synthe- 
sis models. 

The relation between the strength of IMF-sensitive features and
velocity dispersion is shown in Fig. 10. The line strengths are
defined as the average absorption over a central band, with the
continuum determined by interpolating between two side bands. The
error bars are a combination of the formal Poisson uncertainty
(which dominates for the Wing-Ford band and Ca II) and the
uncertainty introduced by the atmospheric absorption correction
(which dominates for Na I). For Na I and the Wing-Ford band we
use the same central and side band definitions as in van Dokkum &
Conroy (2010). For Ca II we use the definitions of Conroy & van
Dokkum (2012). Also shown are measurements from stacked spectra
of metal-rich globular clusters in M31 from van Dokkum & Conroy
(2011) and of high velocity dispersion elliptical galaxies in the
Virgo cluster from van Dokkum & Conroy (2010).
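
The line strength definition above (average absorption over a central band, relative to a continuum interpolated between two side bands) can be sketched as follows. The function names and the band limits in the example are hypothetical placeholders; they are not the band definitions of van Dokkum & Conroy (2010) or Conroy & van Dokkum (2012), and the continuum is assumed to be a straight line through the mean points of the two side bands.

```python
import numpy as np

def band_mean(wave, flux, band):
    """Mean wavelength and mean flux inside a band (lo, hi)."""
    sel = (wave >= band[0]) & (wave <= band[1])
    return wave[sel].mean(), flux[sel].mean()

def line_strength(wave, flux, blue, central, red):
    """Average fractional absorption in the central band."""
    wb, fb = band_mean(wave, flux, blue)
    wr, fr = band_mean(wave, flux, red)
    slope = (fr - fb) / (wr - wb)
    continuum = fb + slope * (wave - wb)          # linear interpolation between side bands
    sel = (wave >= central[0]) & (wave <= central[1])
    return np.mean(1.0 - flux[sel] / continuum[sel])

# Synthetic example: sloped continuum with a 10 % deep box-shaped feature.
wave = np.arange(8100.0, 8300.0, 0.5)
cont = 1.0 + 0.001 * (wave - 8200.0)
in_line = (wave >= 8180.0) & (wave <= 8200.0)
flux = cont * (1.0 - 0.1 * in_line)
index = line_strength(wave, flux, (8140.0, 8160.0), (8180.0, 8200.0), (8220.0, 8240.0))
print(index)
```

Because the index is a ratio to the local continuum it is insensitive to the overall flux calibration, but any weak lines that fall in the side bands shift the interpolated continuum and hence the measured strength.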

The strength of the two dwarf-sensitive features (Na I and the
Wing-Ford band) systematically increases with velocity
dispersion. M31 globular clusters have the weakest absorption,
the four Virgo ellipticals from van Dokkum & Conroy (2010) have
the strongest absorption, and the SAURON galaxies fall in
between. By contrast, the strength of the giant-sensitive Ca II
triplet decreases with velocity dispersion, as was found earlier
by Cenarro et al. (2003). These trends are consistent with a
dwarf contribution that increases systematically with σ. We note,
however, that the Wing-Ford band does not show a significant
correlation with σ within the SAURON sample (i.e., disregarding
the globular clusters and the most massive ellipticals). The
correlation coefficient is positive (0.12) but not significant.
By contrast, the probabilities that the (anti-)correlations of
Na I and Ca II with σ are caused by chance are 0.3 % and 0.1 %,
respectively.
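
One way to attach a chance probability to such a correlation is a permutation test on the rank correlation coefficient. The text does not specify which statistic was used for the quoted probabilities, so the sketch below is an assumption for illustration; the function names are hypothetical, ties in the data are not treated specially, and the example uses synthetic numbers rather than the measured indices.

```python
import numpy as np

def ranks(x):
    """Ranks 0..n-1 of the values in x (ties are not averaged)."""
    return np.argsort(np.argsort(x)).astype(float)

def rank_correlation_pvalue(x, y, n_perm=5000, seed=0):
    """Rank correlation and two-sided permutation probability of chance."""
    rx, ry = ranks(np.asarray(x)), ranks(np.asarray(y))
    r_obs = np.corrcoef(rx, ry)[0, 1]
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_perm):
        r = np.corrcoef(rx, rng.permutation(ry))[0, 1]
        hits += abs(r) >= abs(r_obs)              # shuffled pairs at least as correlated
    return r_obs, (hits + 1) / (n_perm + 1)

# Synthetic example with 34 objects and a built-in trend.
rng = np.random.default_rng(3)
sigma = np.linspace(100.0, 300.0, 34)
index = 0.0002 * sigma + rng.normal(0.0, 0.005, 34)
r, p = rank_correlation_pvalue(sigma, index, n_perm=2000)
print(r, p)
```

The smallest probability such a test can return is 1/(n_perm + 1), so the number of permutations has to be large compared to the inverse of the probability of interest.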

The trends within the SAURON sample are graphically illustrated
in Fig. 11, which shows the variation in the galaxy spectra
ordered by velocity dispersion. It is remarkable that the
IMF-sensitive Na I and Ca II features show the strongest
variation of any lines in the red. Figure 11 also highlights the
well-known fact that many other spectral features show systematic
trends with σ: most notably the Balmer lines and the [O III]
emission lines, but also Mg λ5270 and a host of other metal lines
(see, e.g., Trager et al. 2000b, and many other studies).
Figure 11. Variation in the spectra of the SAURON galaxies. The top
panels show the average de-redshifted spectrum of all sample
galaxies, smoothed to a common velocity dispersion of 300 km s⁻¹ and
divided by a 5th order polynomial. The greyscale shows the
differences between individual galaxy spectra and this averaged spectrum,
after smoothing and median filtering. For clarity the variation in the
red is increased by a factor of three compared to the variation in the
blue. Of all spectral lines at λ > 7800 Å the IMF-sensitive Na I and
Ca II features show the strongest trends with σ.

6. SUMMARY AND CONCLUSIONS 

In this paper we presented deep spectroscopy of a sample of
early-type galaxies in the nearby Universe, obtained with LRIS on
Keck I. The spectra are weighted in such a way that they are
representative for a circular aperture of radius r = r_e/8. Owing
to the fully depleted LBNL detectors in the red arm of LRIS the
S/N ratio of the spectra is high all the way to ~1 μm. The high
S/N ratio and the absence of fringing make it possible to measure
absorption lines with < 0.5 % uncertainty in the far red. The
reduced spectra are available upon request.

The analysis in the present paper is limited to a relatively
qualitative assessment of IMF-sensitive spectral features in the
red part of the spectra. The Na I λ8183,8195 doublet and the FeH
λ9916 Wing-Ford band are strong in dwarfs and weak in giants,
whereas the Ca II λ8498,8542,8662 triplet is weak in dwarfs and
strong in giants. We find that all three features show
considerable variation within the sample. Na I and the Wing-Ford
band vary by a factor of ~2. When abundance and age variations
are ignored, this variation directly translates into a variation
of a factor of ~2 in the number of low mass stars. Ca II varies
only by ~10 %, but this is expected as giants dominate the light.
As part of the analysis we demonstrate that the variation in the
Na I feature is indeed due to variation in the strength of the
Na I doublet and not driven by the neighboring TiO bandhead (see,
e.g., Schiavon et al. 1997a, for a discussion of this issue).⁴

The variation in IMF-sensitive features correlates with the 
velocity dispersion of the galaxies: a higher velocity disper- 
sion implies stronger Na I, a stronger Wing-Ford band, and 
weaker Ca II. The anti-correlation of Ca II and velocity dis- 
persion was previously discussed by Cenarro et al. (2003), 
who also interpreted it as a possible IMF effect. These results 
extend our earlier measurements of very massive ellipticals in 
the Virgo cluster (van Dokkum & Conroy 2010) and metal-
rich globular clusters in M31 (van Dokkum & Conroy 2011);
these previous studies "bookend" the SAURON galaxies at 
very high and very low dispersions respectively. 

As shown in Conroy & van Dokkum (2012) it is hazardous to derive
quantitative IMF constraints from these three features alone, as
age and abundance variations contribute to the observed
absorption line strengths. The Wing-Ford band is sensitive to the
Fe abundance, Na I is sensitive to [Na/Fe], and the Ca II triplet
is very sensitive to [Ca/Fe] (and the overall α-enhancement). All
three indices also depend on age, in complex ways (see, e.g.,
Fig. 12 in Conroy & van Dokkum 2012a). In our initial paper on
the most massive galaxies in Virgo and Coma we mostly ignored
these effects, which was perhaps justified because the IMF
effects were so strong in that sample. However, it is clear that
the trends in Fig. 10 to some extent reflect the correlations of
age and metal line abundances with velocity dispersion (see,
e.g., Trager et al. 2000a; Thomas et al. 2005; Kelson et al.
2006; Sánchez-Blázquez et al. 2006; Graves, Faber, & Schiavon
2009; Scott et al. 2009; Worthey, Ingermann, & Serven 2011, and
many other studies).

In a companion paper (Conroy & van Dokkum 2012b) we 
use a comprehensive stellar population synthesis model to 
quantify the IMF variation among the early-type galaxies dis- 
cussed in the present paper. This model allows for abundance 
variations of individual elements, which is critical as it re- 
moves the ad-hoc assumption that we understand relative ele- 
mental abundances better than we understand the IMF. Fur- 
thermore, we fit the entire spectrum of each galaxy rather 
than line indices, which means that blended spectral lines are 
treated correctly. 

The data presented herein were obtained at the W. M. Keck 
Observatory, which is operated as a scientific partnership 
among the California Institute of Technology, the University 
of California and the National Aeronautics and Space Admin- 
istration. The Observatory was made possible by the generous 
financial support of the W. M. Keck Foundation. The authors 
wish to recognize and acknowledge the very significant cul- 
tural role and reverence that the summit of Mauna Kea has 
always had within the indigenous Hawaiian community. We 
are most fortunate to have the opportunity to conduct obser- 
vations from this mountain. 



4 Although we can resolve this particular issue, the Na I ambiguity
illustrates the difficulty of interpreting line indices: essentially
all indices reflect a canopy of blended spectral lines.
APPENDIX
COMPARISON TO VAN DOKKUM (2008) 

Building on many previous studies of the fundamental plane (e.g., Djorgovski & Davis 1987; van der Wel et al. 2004; van
Dokkum & van der Marel 2007) and the color-magnitude relation (e.g., Bower, Lucey, & Ellis 1992a; Stanford, Eisenhardt, &
Dickinson 1998; Holden et al. 2004), van Dokkum (2008) [vD08] constrained the slope of the IMF near 1 M⊙ in early-type
galaxies by comparing their luminosity evolution to their color evolution. This test, first proposed by Tinsley (1980), is based
on the expectation that luminosity and color evolution depend on the IMF in different ways. As discussed in Tinsley (1980) and
in § 2.1 of vD08 a more bottom-heavy IMF should lead to slower luminosity evolution and faster color evolution. Therefore, a
comparison of luminosity evolution to color evolution of a sample of galaxies should provide strong constraints on the slope of
the IMF near the main sequence turn-off (≈ 1 M⊙).

The application of this test to massive early-type galaxies in clusters at 0 < z < 1 yielded a surprising result: the slow rest-frame
U−V color evolution of the galaxies and fast evolution of their rest-frame M/L_B ratios seemed to imply an IMF that is deficient
in low mass stars ("bottom-light") compared to the IMF in the Milky Way. The key result from vD08 is shown in Fig. A1(a). The
green line is the predicted evolution of a Maraston (2005) model with super-Solar metallicity ([Z/H] = 0.35) and a Salpeter IMF.
This model is not a good fit to the data: the green line has a slope a = Δlog(M/L_B)/Δ(U−V) = 1.5, whereas a fit to the data gives
a = 2.6. From Eq. 5 and 6 in vD08 it follows that the slope of the IMF near 1 M⊙ is in the range 0.1 < x < 1.3 depending on the
metallicity, where x = 2.3 is the value for both a Milky Way IMF and a Salpeter IMF.⁵

This finding is in apparent conflict with the results in van Dokkum & Conroy (2010, 2011) and paper II. Contradictory results
are not exactly uncommon in this particular field (examples can be found in the review by Bastian, Covey, & Meyer 2010).
However, in this case the contradiction is rather extreme (bottom-light versus bottom-heavy with respect to the Milky Way) and
applies to the exact same galaxies (massive early-type galaxies in clusters).⁶ Note that the results of vD08 are not in conflict with
recent mass measurements of early-type galaxies (Treu et al. 2010; Cappellari et al. 2012; Spiniello et al. 2012), as a bottom-light
IMF and a bottom-heavy IMF can result in very similar M/L ratios.⁷




Figure A1. Color and luminosity evolution of early-type galaxies in clusters. Both panels have Δ(U−V) on the horizontal axis. (a) Evolution at fixed dynamical mass, for galaxies with M >
10¹¹ M⊙. This panel is nearly identical to Fig. 5 in van Dokkum (2008); the only difference is that no corrections for progenitor bias were
applied. Lines show expectations for a Salpeter (1955) IMF, for a high metallicity Maraston (2005) model with [Z/H] = 0.35 (green) and for an
α-enhanced Conroy & van Dokkum (2012a) model with [Fe/H] = 0 and [α/Fe] = 0.2 (red). The black line is the best fit to the data; grey regions
indicate the 68 % and 95 % confidence limits of the best-fitting slope. The models predict lower luminosity evolution at fixed color evolution
than observed, although a CvD model with "frosting" of young stars comes close to the data. (b) Evolution at fixed velocity dispersion, for
galaxies with σ > 200 km s⁻¹, and including the Virgo cluster. The CvD models are a satisfactory fit to the data.

5 Note that in vD08 the IMF was defined such that the Salpeter form corresponds to a slope of 1.35.

6 For those readers who failed to notice this: the 2008 and 2010 studies also have the same first author.†

† This is not a coincidence: the 2010 paper was partly motivated by the desire to confirm‡ the conclusions of the 2008 paper using a more direct method.

‡ This did not work out quite as expected.

7 For a bottom-light IMF the "extra" mass is not in the form of low mass stars but is comprised of the remnants of high mass stars (white dwarfs, neutron stars, and black holes).

Here we update the data and models of vD08 and examine whether they can be brought into agreement with the absorption line
studies that indicate heavy mass functions. Compared to the analysis in vD08 the following changes were made:

1. The [Z/H] = 0.35 Maraston (2005) model was replaced by an [Fe/H] = 0, [α/Fe] = 0.2 Conroy & van Dokkum (2012) [CvD]
model. As shown in Fig. A1 the M/L ratio in the CvD model evolves slightly faster in the age range 5-9 Gyr, although the
difference is small. It remains true that stellar population synthesis models are in reasonable agreement on the evolution of
rest-frame optical colors and luminosities.

2. A model with mild "frosting" of young stars was created (dashed line in Fig. A1). Based on Lick indices Trager, Faber, &
Dressler (2008) find that early-type galaxies in the Coma cluster have relatively young absolute luminosity-weighted ages,
similar to early-type galaxies in the general field. A possible explanation is that early-type galaxies have a small fraction
of relatively young stars in addition to a dominant old population (see, e.g., Trager et al. 2000a). The dashed line is for a
model in which 80 % of the stars have ages between 3 and 13.5 Gyr and 20 % of the stars are 3 Gyr old.

3. In vD08 there was only one nearby cluster, Coma, for which both accurate U−V colors and M/L ratio measurements
were available. We added the Virgo cluster to have another datapoint in the Δlog(M/L_B) ~ Δ(U−V) ~ 0 region of
Fig. A1. Effective radii and surface brightnesses were obtained from Burstein et al. (1987). The effective radii are in
excellent agreement with the data in Cappellari et al. (2006). Surface brightnesses were corrected from the average surface
brightness within r_e to the surface brightness at r_e and corrected for cosmological surface brightness dimming. A distance
of 16.5 Mpc was assumed, based on surface brightness fluctuations measured in the ACS Virgo Cluster Survey (Mei et al.
2007). Velocity dispersions were obtained from Davies et al. (1987), multiplied by 0.95 to undo the aperture correction,
and then corrected to a 3.4″ diameter circular aperture at the distance of Coma (see Jørgensen, Franx, & Kjærgaard 1995).
The U−V colors were obtained from Bower, Lucey, & Ellis (1992b).

4. A key assumption in vD08 was that structural evolution of the galaxies could be ignored. As discussed in Holden et al.
(2010) this assumption is important: the measured color and luminosity evolution can be different from the true evolution
if the masses and sizes of the galaxies change with time. Following earlier results for field galaxies (e.g., Daddi et al. 2005;
Trujillo et al. 2006; van Dokkum et al. 2008) there is now evidence for size evolution in clusters, such that cluster galaxies
of a given mass were smaller at higher redshifts (van der Wel et al. 2008; Strazzullo et al. 2010; Raichoor et al. 2012).
Whether this applies to all clusters is still unclear, as is the question whether the size evolution is driven by infall from
the field or changes to individual cluster galaxies (e.g., van der Wel et al. 2009). However, it does suggest that structural
evolution needs to be considered. To address this issue Holden et al. (2010) measured the color and M/L evolution of
early-type galaxies at fixed velocity dispersion rather than mass, reasoning that the velocity dispersion is probably a more
stable parameter (see, e.g., Bezanson et al. 2011). From a comparison of the Coma cluster to a single cluster at z = 0.83
they found that the color and M/L evolution of galaxies at fixed dispersion is only 2.3σ removed from expectations of a
Salpeter IMF. We now follow Holden et al. (2010) and measure offsets in color and M/L ratio from the (U−V) - σ and
M/L_B - σ relations for galaxies with σ > 200 km s⁻¹.

The results of these updates are shown in Fig. A1(b). The data are now in much better agreement with a Salpeter IMF. The best-
fitting relation has a slope of a = 1.81 ± 0.27, which means that the high-metallicity Maraston (2005) model with a Salpeter IMF
is only 1.2σ removed from the data. The Conroy & van Dokkum (2012) model is in even better agreement, particularly if some
frosting is included. We infer that the luminosity and color evolution of massive early-type galaxies does not rule out IMFs with
Salpeter-like slopes near ~1 M⊙, contrary to the conclusions of vD08.



8 Note that in calculating M/L ratios we are still assuming that homology is conserved, which is almost certainly incorrect; see, for instance, van Dokkum et al. (2010) and Buitrago et al. (2011).






REFERENCES 



Auger, M. W., Treu, T., Gavazzi, R., Bolton, A. S., Koopmans, L. V. E., & Marshall, P. J. 2010, ApJ, 721, L163
Bacon, R., Copin, Y., Monnet, G., Miller, B. W., Allington-Smith, J. R., Bureau, M., Carollo, C. M., Davies, R. L., et al. 2001, MNRAS, 326, 23
Bastian, N., Covey, K. R., & Meyer, M. R. 2010, ARA&A, 48, 339
Bate, M. R., Bonnell, I. A., & Bromm, V. 2003, MNRAS, 339, 577
Beers, T. C., Flynn, K., & Gebhardt, K. 1990, AJ, 100, 32
Bezanson, R., van Dokkum, P. G., Franx, M., Brammer, G. B., Brinchmann, J., Kriek, M., Labbé, I., Quadri, R. F., et al. 2011, ApJ, 737, L31
Bohlin, R. C. 1996, AJ, 111, 1743
Bower, R. G., Lucey, J. R., & Ellis, R. S. 1992a, MNRAS, 254, 601
Bower, R. G., Lucey, J. R., & Ellis, R. S. 1992b, MNRAS, 254, 589
Buitrago, F., Trujillo, I., Conselice, C. J., & Haeussler, B. 2011, ArXiv e-prints
Burstein, D., Davies, R. L., Dressler, A., Faber, S. M., Stone, R. P. S., Lynden-Bell, D., Terlevich, R. J., & Wegner, G. 1987, ApJS, 64, 601
Cappellari, M., Bacon, R., Bureau, M., Damen, M. C., Davies, R. L., de Zeeuw, P. T., Emsellem, E., Falcón-Barroso, J., et al. 2006, MNRAS, 366, 1126
Cappellari, M., McDermid, R. M., Alatalo, K., Blitz, L., Bois, M., Bournaud, F., Bureau, M., Crocker, A. F., et al. 2012, ArXiv e-prints
Carter, D., Visvanathan, N., & Pickles, A. J. 1986, ApJ, 311, 637
Cenarro, A. J., Gorgas, J., Vazdekis, A., Cardiel, N., & Peletier, R. F. 2003, MNRAS, 339, L12
Chabrier, G. 2003, PASP, 115, 763
Cohen, J. G. 1978, ApJ, 221, 788
Conroy, C. & van Dokkum, P. 2012, ApJ, 747, 69
Couture, J. & Hardy, E. 1993, ApJ, 406, 142
Daddi, E., Renzini, A., Pirzkal, N., Cimatti, A., Malhotra, S., Stiavelli, M., Xu, C., Pasquali, A., et al. 2005, ApJ, 626, 680
Davé, R. 2008, MNRAS, 385, 147
Davies, R. L., Burstein, D., Dressler, A., Faber, S. M., Lynden-Bell, D., Terlevich, R. J., & Wegner, G. 1987, ApJS, 64, 581
de Zeeuw, P. T., Bureau, M., Emsellem, E., Bacon, R., Carollo, C. M., Copin, Y., Davies, R. L., Kuntschner, H., et al. 2002, MNRAS, 329, 513
Djorgovski, S. & Davis, M. 1987, ApJ, 313, 59
Dutton, A. A., Mendel, J. T., & Simard, L. 2012, MNRAS, 422, L33
Emsellem, E., Cappellari, M., Krajnović, D., van de Ven, G., Bacon, R., Bureau, M., Davies, R. L., de Zeeuw, P. T., et al. 2007, MNRAS, 379, 401
Faber, S. M. & French, H. B. 1980, ApJ, 235, 405
Fardal, M. A., Katz, N., Weinberg, D. H., & Davé, R. 2007, MNRAS, 379, 985
Graves, G. J., Faber, S. M., & Schiavon, R. P. 2009, ApJ, 698, 1590
Holden, B. P., Stanford, S. A., Eisenhardt, P., & Dickinson, M. 2004, AJ, 127, 2484
Holden, B. P., van der Wel, A., Kelson, D. D., Franx, M., & Illingworth, G. D. 2010, ApJ, 724, 714
Hopkins, P. F., Hernquist, L., Cox, T. J., Kereš, D., & Wuyts, S. 2009, ApJ, 691, 1424
J0rgensen, I., Franx, M., & Kjasrgaard, P. 1995, MNRAS, 276, 1341 
Kelson, D. D., Illingworth, G. D., Franx, M., & van Dokkum, P. G. 2006, 
ApJ, 653, 159 

Kormendy, J., Fisher, D. B., Cornell, M. E., & Bender, R. 2009, ApJS, 182, 
216 

Kriek, M., van Dokkum, P. G., Franx, M., Illingworth, G. D., Marchesini, D.,
Quadri, R., Rudnick, G., Taylor, E. N., et al. 2008, ApJ, 677, 219

Kroupa, P. 2001, MNRAS, 322, 231

Krumholz, M. R. 2011, ApJ, 743, 110

Kuntschner, H., Emsellem, E., Bacon, R., Cappellari, M., Davies, R. L., de
Zeeuw, P. T., Falcón-Barroso, J., Krajnović, D., et al. 2010, MNRAS, 408,
97

Larson, R. B. 2005, MNRAS, 359, 211

Maraston, C. 2005, MNRAS, 362, 799

Mei, S., Blakeslee, J. P., Côté, P., Tonry, J. L., West, M. J., Ferrarese, L.,
Jordán, A., Peng, E. W., et al. 2007, ApJ, 655, 144

Myers, A. T., Krumholz, M. R., Klein, R. I., & McKee, C. F. 2011, ApJ, 735,
49

Naab, T., Johansson, P. H., Ostriker, J. P., & Efstathiou, G. 2007, ApJ, 658,
710 

Oke, J. B., et al. 1995, PASP, 107, 375

Oser, L., Ostriker, J. P., Naab, T., Johansson, P. H., & Burkert, A. 2010, ApJ, 
725, 2312 

Padoan, P. & Nordlund, Å. 2002, ApJ, 576, 870

Raichoor, A., Mei, S., Stanford, S. A., Holden, B. P., Nakata, F., Rosati, P.,
Shankar, F., Tanaka, M., et al. 2012, ApJ, 745, 130

Rockosi, C., Stover, R., Kibrick, R., Lockwood, C., Peck, M., Cowley,
D., Bolte, M., Adkins, S., et al. 2010, in Society of Photo-Optical
Instrumentation Engineers (SPIE) Conference Series, Vol. 7735, Society 
of Photo-Optical Instrumentation Engineers (SPIE) Conference Series 

Salpeter, E. E. 1955, ApJ, 121, 161 

Sánchez-Blázquez, P., Gorgas, J., Cardiel, N., & González, J. J. 2006, A&A,
457, 787 

Schiavon, R. P. 1998, PhD thesis, Universidade de São Paulo, Brazil (1998)

Schiavon, R. P., Barbuy, B., Rossi, S. C. F., & Milone, A. 1997a, ApJ, 479,
902

Schiavon, R. P., Barbuy, B., & Singh, P. D. 1997b, ApJ, 484, 499 

Scott, N., Cappellari, M., Davies, R. L., Bacon, R., de Zeeuw, P. T.,
Emsellem, E., Falcón-Barroso, J., Krajnović, D., et al. 2009, MNRAS,
398, 1835

Spiniello, C., Koopmans, L. V. E., Trager, S. C., Czoske, O., & Treu, T. 2011,
MNRAS, 417, 3000

Spiniello, C., Trager, S. C., Koopmans, L. V. E., & Chen, Y. 2012, ArXiv
e-prints

Spinrad, H. 1962, ApJ, 135, 715 

Spinrad, H. & Taylor, B. J. 1971, ApJS, 22, 445 

Stanford, S. A., Eisenhardt, P. R., & Dickinson, M. 1998, ApJ, 492, 461

Strazzullo, V., Rosati, P., Pannella, M., Gobat, R., Santos, J. S., Nonino, M.,
Demarco, R., Lidman, C., et al. 2010, A&A, 524, A17

Thomas, D., Maraston, C., Bender, R., & Mendes de Oliveira, C. 2005, ApJ,
621, 673

Thomas, J., Saglia, R. P., Bender, R., Thomas, D., Gebhardt, K., Magorrian,
J., Corsini, E. M., Wegner, G., et al. 2011, MNRAS, 415, 545

Tinsley, B. M. 1980, Fundamentals of Cosmic Physics, 5, 287

Trager, S. C., Faber, S. M., & Dressler, A. 2008, MNRAS, 386, 715

Trager, S. C., Faber, S. M., Worthey, G., & Gonzalez, J. J. 2000a, AJ, 120,
165

Trager, S. C., Faber, S. M., Worthey, G., & Gonzalez, J. J. 2000b, AJ, 119,
1645

Treu, T., Auger, M. W., Koopmans, L. V. E., Gavazzi, R., Marshall, P. J., &
Bolton, A. S. 2010, ApJ, 709, 1195

Trujillo, I., Förster Schreiber, N. M., Rudnick, G., Barden, M., Franx, M.,
Rix, H.-W., Caldwell, J. A. R., McIntosh, D. H., et al. 2006, ApJ, 650, 18

van der Wel, A., Bell, E. F., van den Bosch, F. C., Gallazzi, A., & Rix, H.-W.
2009, ApJ, 698, 1232

van der Wel, A., Franx, M., van Dokkum, P. G., & Rix, H.-W. 2004, ApJ,
601, L5

van der Wel, A., Holden, B. P., Zirm, A. W., Franx, M., Rettura, A.,
Illingworth, G. D., & Ford, H. C. 2008, ApJ, 688, 48

van Dokkum, P. G. 2008, ApJ, 674, 29

van Dokkum, P. G. & Conroy, C. 2010, Nature, 468, 940

van Dokkum, P. G. & Conroy, C. 2011, ApJ, 735, L13

van Dokkum, P. G., Franx, M., Kriek, M., Holden, B., Illingworth, G. D.,
Magee, D., Bouwens, R., Marchesini, D., et al. 2008, ApJ, 677, L5 

van Dokkum, P. G. & van der Marel, R. P. 2007, ApJ, 655, 30 

van Dokkum, P. G., Whitaker, K. E., Brammer, G., Franx, M., Kriek, M.,
Labbé, I., Marchesini, D., Quadri, R., et al. 2010, ApJ, 709, 1018

Wilkins, S. M., Trentham, N., & Hopkins, A. M. 2008, MNRAS, 385, 687 

Wing, R. F. & Ford, Jr., W. K. 1969, PASP, 81, 527 

Worthey, G. 1994, ApJS, 95, 107 

Worthey, G., Ingermann, B. A., & Serven, J. 2011, ApJ, 729, 148



